Method and system for reducing risk values discrepancies between categories

ABSTRACT

The present teaching generally relates to removing perturbations from predictive scoring. In one embodiment, data representing a plurality of events detected by a content provider may be received, the data indicating a time that a corresponding event occurred and whether the corresponding event was fraudulent. First category data may be generated by grouping each event into one of a number of categories, each category being associated with a range of times. A first measure of risk for each category may be determined, where the first measure of risk indicates a likelihood that a future event occurring at a future time is fraudulent. Second category data may be generated by processing the first category data and a second measure of risk for each category may be determined. Measure data representing the second measure of risk for each category and the range of times associated with that category may be stored.

CROSS REFERENCE TO RELATED APPLICATION

The present application is a continuation of U.S. patent application Ser. No. 15/846,500, filed Dec. 19, 2017, entitled “METHOD AND SYSTEM FOR REDUCING RISK VALUES DISCREPANCIES BETWEEN CATEGORIES”, the contents of which are hereby incorporated by reference in its entirety.

BACKGROUND

1. Technical Field

The present teaching generally relates to reducing risk value discrepancies between categories. More specifically, the present teaching relates to reducing risk value discrepancies between categories for use with a predictive model.

2. Technical Background

A new field in machine learning corresponds to adversarial machine learning. In adversarial machine learning, various machine learning models may be vulnerable to misclassification due to minor offsets to the inputs used. For example, an input value A may yield score X for a predictive model, where an input using input value B, which is only slightly different from input value A, may yield score Y, which may be vastly different than score X.

In machine learning, and in particular fraud detection and click-through-rate (“CTR”) prediction, raw features (e.g., inputs) are modified into a representation capable of being input into one or more predictive models. Fraud detection, as described herein, may correspond to Traffic Protection (“TP”), credit card fraud detection, Internet payment fraud detection, and the like. The modification process is commonly referred to as “feature encoding.” However, as mentioned above, a drawback to feature encoding is that, for certain features, small changes to a value of the feature may cause a substantial change to the predictive score. This leaves the predictive model vulnerable to adverse parties who may try to exploit this drawback by modifying feature values to invalidate a predictive score, or to avoid producing a suspicious one.

Thus, there is a need for methods and systems that remove the effects of minor changes to feature values causing major changes in predictive scores. The present teaching aims to address these issues.

SUMMARY

The teachings disclosed herein relate to methods, systems, and programming for reducing risk value discrepancies between categories. More particularly, the present teaching relates to methods, systems, and programming related to reducing risk value discrepancies between categories for use with a predictive model.

In one example, a method, implemented on a machine having at least one processor, memory, and communications circuitry capable of connecting to a network for removing perturbations from predictive scoring is described. Data representing a plurality of events detected by a content provider may be received, the data indicating a time that a corresponding event occurred and whether the corresponding event was fraudulent. First category data may be generated by grouping each of the plurality of events into one of a number of categories, each category being associated with a range of times. A first measure of a risk for each category may be determined, where the first measure of the risk indicates a likelihood that a future event occurring at a future time is fraudulent. Second category data may be generated by processing the first category data, and a second measure of the risk for each category may be determined. Measure data representing the second measure of risk for each category and the range of times associated with that category may then be stored.

In a different example, a system for removing perturbations from predictive scoring is described. The system may include a user event detection system, a data bin filling system, a measure of risk determination system, and a measure risk database. The user event detection system may be configured to receive data representing a plurality of events detected by a content provider, the data indicating a time that a corresponding event occurred and whether the corresponding event was fraudulent. The data bin filling system may be configured to generate first category data by grouping each of the plurality of events into one of a number of categories, each category being associated with a range of times, and generate second category data by processing the first category data. The measure of risk determination system may be configured to determine a first measure of a risk for each category, where the first measure of the risk indicates a likelihood that a future event occurring at a future time is fraudulent, and determine a second measure of the risk for each category. The measure risk database may be configured to store measure data representing the second measure of risk for each category and the range of times associated with that category.

Other concepts relate to software for implementing the present teaching on removing perturbations from predictive scoring. A software product, in accord with this concept, includes at least one machine-readable non-transitory medium and information and/or instructions stored thereon. The instructions stored on the medium may include executable program code data, parameters in association with the executable program code, and/or information related to a user, a request, content, or information related to removing perturbations from predictive scoring, etc.

In one example, a machine-readable, non-transitory and tangible medium having instructions recorded thereon for removing perturbations from predictive scoring is described. The instructions, when executed by at least one processor of a computing device, may cause the computing device to receive data representing a plurality of events detected by a content provider, the data indicating a time that a corresponding event occurred and whether the corresponding event was fraudulent; generate first category data by grouping each of the plurality of events into one of a number of categories, each category being associated with a range of times; determine a first measure of a risk for each category, wherein the first measure of the risk indicates a likelihood that a future event occurring at a future time is fraudulent; generate second category data by processing the first category data; determine a second measure of the risk for each category; and store measure data representing the second measure of risk for each category and the range of times associated with that category.

Additional novel features will be set forth in part in the description that follows, and in part will become apparent to those skilled in the art upon examination of the following and the accompanying drawings or may be learned by production or operation of the examples. The novel features of the present teachings may be realized and attained by practice or use of various aspects of the methodologies, instrumentalities and combinations set forth in the detailed examples discussed below.

BRIEF DESCRIPTION OF THE DRAWINGS

The methods, systems and/or programming described herein are further described in terms of exemplary embodiments. These exemplary embodiments are described in detail with reference to the drawings. These embodiments are non-limiting exemplary embodiments, in which like reference numerals represent similar structures throughout the several views of the drawings, and wherein:

FIGS. 1A and 1B are illustrative diagrams of an exemplary network environment for evaluating risk and mitigating effects of feature value manipulation to affect predictive model scores, in accordance with various embodiments of the present teaching;

FIG. 2A is an illustrative diagram of an exemplary risk evaluation system, in accordance with various embodiments of the present teaching;

FIG. 2B is an illustrative flowchart of an exemplary process for mitigating the effects of feature value manipulation, in accordance with various embodiments of the present teaching;

FIG. 3A is an illustrative diagram of an exemplary user event detection system, in accordance with various embodiments of the present teaching;

FIG. 3B is an illustrative flowchart of an exemplary process for generating event data, in accordance with various embodiments of the present teaching;

FIG. 4A is an illustrative diagram of an exemplary data bin filling system, in accordance with various embodiments of the present teaching;

FIG. 4B is an illustrative diagram of an exemplary data structure for category data, in accordance with various embodiments of the present teaching;

FIG. 4C is an illustrative flowchart of an exemplary process for generating bin data, in accordance with various embodiments of the present teaching;

FIG. 5A is an illustrative diagram of an exemplary measure of risk determination system, in accordance with various embodiments of the present teaching;

FIG. 5B is an illustrative flowchart of an exemplary process for determining a measure of risk, in accordance with various embodiments of the present teachings;

FIG. 6A is an illustrative diagram of an exemplary data processing system, in accordance with various embodiments of the present teaching;

FIG. 6B is an illustrative flowchart of an exemplary process for determining and applying a smoothing function to bin data, in accordance with various embodiments of the present teaching;

FIG. 7A is an illustrative diagram of an exemplary event identifier, in accordance with various embodiments of the present teaching;

FIG. 7B is an illustrative flowchart of an exemplary process for determining whether an event is a fraudulent event, in accordance with various embodiments of the present teaching;

FIG. 7C is an illustrative graph of two example processed categories, in accordance with various embodiments of the present teachings;

FIG. 8 is an illustrative diagram of an exemplary mobile device architecture that may be used to realize a specialized system implementing the present teaching in accordance with various embodiments; and

FIG. 9 is an illustrative diagram of an exemplary computing device architecture that may be used to realize a specialized system implementing the present teaching in accordance with various embodiments.

DETAILED DESCRIPTION

In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant teachings. However, it should be apparent to those skilled in the art that the present teachings may be practiced without such details. In other instances, well known methods, procedures, components, and/or circuitry have been described at a relatively high level, without detail, in order to avoid unnecessarily obscuring aspects of the present teachings.

The present disclosure generally relates to systems, methods, medium, and other implementations directed to removing the effects of minor changes to feature values that cause major changes in predictive scores. The disclosed teaching on removing these effects includes, but is not limited to, receiving data representing events detected by a content provider, generating first category data by grouping each event into a category, determining a first measure of risk for each category, generating second category data by processing the first category data, determining a second measure of risk for each category, and storing measure data representing the second measure of risk.

Predictive scores, as described herein, are values generated by a predictive model, or predictive models. Predictive scores are generated by inputting feature values into a predictive model. Predictive models may be trained using a large number (e.g., hundreds, thousands, etc.) of features, where each feature may reflect a likelihood of fraudulent activity.

Typically, feature values are processed to generate values that may be input into the predictive model(s). The feature values prior to processing may be referred to as “raw features,” or “raw feature values.” Prior to being used by the predictive model(s), the raw features may go through a process called “feature encoding,” by which the raw features are converted into a format usable by the predictive model(s). One exemplary feature encoding technique is referred to as “weight of evidence,” or WOE. WOE may be employed in a wide range of settings; in the present discussion, however, WOE particularly describes scenarios where fraudulent and non-fraudulent events are referenced.

In some embodiments, WOE may be described by Equation 1:

$WOE_{i} = \log\left(\frac{P_{i}}{N_{i}}\right) - \log\left(\frac{P}{N}\right). \qquad \text{(Equation 1)}$

In Equation 1, P_i and N_i correspond to fraudulent events (e.g., positive examples) and non-fraudulent events (e.g., negative examples) in the i-th category of a feature, respectively. P and N correspond to fraudulent events (e.g., positive examples) and non-fraudulent events (e.g., negative examples) that occurred in the plurality of events being analyzed. Generally speaking, Equation 1 reflects a portion associated with the odds that a particular category has fraudulent activity, and a second portion indicating the odds of fraudulent activity generally within the plurality of events. For example, if WOE_i is positive, then this may indicate that a fraud probability—a likelihood of fraudulent activity—of a particular category, including a subset of events of the plurality, is greater than the fraud probability of the entire plurality of events. In other words, WOE may measure a relative risk of an event being fraudulent within one category.
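
To make Equation 1 concrete, the following is a minimal Python sketch of a per-category WOE computation, assuming the fraudulent and non-fraudulent counts for each category are already known; the function and variable names are illustrative rather than part of the disclosed system.

```python
import math

def weight_of_evidence(category_counts):
    """Compute WOE_i = log(P_i / N_i) - log(P / N) for each category.

    category_counts: list of (P_i, N_i) pairs, where P_i is the number of
    fraudulent events and N_i the number of non-fraudulent events in the
    i-th category. Assumes every count is non-zero.
    """
    P = sum(p for p, _ in category_counts)  # total fraudulent events
    N = sum(n for _, n in category_counts)  # total non-fraudulent events
    base = math.log(P / N)
    return [math.log(p_i / n_i) - base for p_i, n_i in category_counts]

# Three hypothetical hourly categories with (fraudulent, non-fraudulent) counts.
woe_values = weight_of_evidence([(20, 80), (5, 195), (50, 50)])
```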

FIGS. 1A and 1B are illustrative diagrams of an exemplary network environment for evaluating risk and mitigating effects of feature value manipulation to affect predictive model scores, in accordance with various embodiments of the present teaching. In FIG. 1A, an exemplary networked environment 100 is described. Publishers may earn revenue by providing and displaying advertisements on websites. Generally speaking, the greater the number of visitors (e.g., traffic) at the website where the advertisement is displayed, the greater the revenue for the publisher. However, dishonest and fraudulent individuals, companies, etc., may use these same principles to collect money under false pretenses. For example, a fraudulent user, which as described herein may correspond to any individual, group of individuals, business, and/or entity that is attempting to obtain revenue under false pretenses, may create websites and/or take over an existing website, simulate traffic, and earn revenue via that traffic. Fraudulent users who create multiple sites, each of which only collects a smaller amount of money, may further compound this problem. This may allow the fraudulent users to go unnoticed, as no one website generates enough money to raise suspicion; however, collectively the sites may bring in a larger amount of revenue for the fraudster.

In order for fraudulent users to simulate traffic for each website created, the fraudulent users may need certain data. For example, and without limitation, user devices, browser cookies, internet protocol (“IP”) addresses, user agent strings, and the like, may be needed in order to simulate believable traffic. As an illustrative example, multiple browser cookies may be generated by repeatedly extracting a browser cookie from a web browser's cache file, clearing that browser's browsing history, and browsing again, thereby generating a new browser cookie. In certain scenarios, fraudulent users may take those extracted browser cookies and place them on additional user devices so that different devices share one or more of the same browser cookies. User agent strings may also be fraudulently created using web automation tools to alter the user agent string. This, for example, may allow a user agent string that is initially declared as being for one type of operating system to be modified such that it declares itself as being for a different type of operating system. While changing/modifying IP addresses is slightly more difficult, fraudulent users may employ IP botnets or cloud servers to acquire IP addresses, which may even be shared amongst fraudulent users across multiple websites.

FIG. 1A is an illustrative diagram of an exemplary networked environment for detecting one or more fraudulent events, in accordance with various embodiments of the present teaching. In FIG. 1A, an exemplary networked environment 100 may include one or more user devices 110, a content provider 130, one or more content sources 160, and a risk evaluation system 140, each of which may be capable of communicating with one another via one or more networks 120. Network(s) 120, in some embodiments, may be a single network or a combination of different networks. For example, network(s) 120 may be a local area network (“LAN”), a wide area network (“WAN”), a public network, a private network, a proprietary network, a Public Telephone Switched Network (“PSTN”), the Internet, an intranet, a wireless network, a virtual network, and/or any combination thereof. In one embodiment, network(s) 120 may also include various network access points. For example, networked environment 100 may include wired or wireless access points such as, and without limitation, base stations or Internet exchange points 120-a . . . 120-b. Base stations 120-a, 120-b may facilitate, for example, communications to/from user devices 110 with one or more other components of networked environment 100 across network(s) 120.

User devices 110 may be of different types to facilitate one or more users operating user devices 110 to connect to network(s) 120. User devices 110 may correspond to any suitable type of electronic device including, but not limited to, desktop computers 110-d, mobile devices 110-c (e.g., mobile phones, smart phones, personal display devices, personal digital assistants (“PDAs”), gaming consoles/devices, wearable devices (e.g., watches, pins/brooches, headphones, etc.)), transportation devices 110-b (e.g., cars, trucks, motorcycles, boats, ships, trains, airplanes), mobile computers 110-a (e.g., laptops, ultrabooks), smart devices (e.g., televisions, set top boxes, smart televisions), smart household devices (e.g., refrigerators, microwaves, etc.), and/or smart accessories (e.g., light bulbs, light switches, electrical switches, etc.). A user (e.g., an individual or individuals), in one embodiment, may send data (e.g., a request) and/or receive data (e.g., content) via user devices 110.

Content sources 160 may include one or more content providers 160-a, 160-b, and 160-c, in some embodiments. Although three content sources are shown within environment 100, any number of content providers may be included. Content sources 160 may correspond to any suitable content source, such as, and without limitation, an individual, a business, an organization, and the like, which may be referred to herein collectively as an “entity” or “entities.” For example, content sources 160 may correspond to a government website, a news site, a social media website, and/or a content feed source (e.g., a blog). In some embodiments, content sources 160 may be vertical content sources. Each content source 160 is configured to generate and send content to one or more of user devices 110 via network(s) 120. The content (e.g., a webpage) may include information consumable by a user via their user device 110, for instance, as well as one or more advertisements or other information.

Content provider(s) 130 may correspond to one or more content providers such as, and without limitation, a publisher or publishers that publish content and/or advertisements. For example, content provider(s) 130 may be configured to present content obtained from one or more of content sources 160. In some embodiments, content providers 130 may present one or more advertisements thereon, which may be selected from an advertisement database, an advertisement source, and/or any other suitable entity (e.g., content source 160). In some embodiments, content provider(s) 130 is/are configured to provide product(s) and/or service(s), and may be configured to handle the advertising process for its own product(s) and/or a service (e.g., websites, mobile applications, etc.) related to advertising, or a combination thereof. For example, content providers 130 may include such systems as an advertising agency or a dealer of advertisements that operates a platform connecting an advertiser or advertising agency with one or more additional entities.

Risk evaluation system 140, as described in greater detail below, may be configured to measure an amount of risk associated with one or more user activities, and may further modify feature values to mitigate the effects associated with slight perturbations in feature values, and the effect that those perturbations may have on outputs from a predictive model. In some embodiments, risk evaluation system 140 may obtain user interaction/user activity data from a user interaction database 150. The user interaction data may represent a plurality of user events that have occurred on, or been detected by, content provider 130. For example, user activity data associated with a webpage hosted by a content provider 130 may be stored by user interaction database 150. The interaction data may be associated with a certain temporal duration. For example, user activity associated with a day's worth of traffic at a webpage may be stored by user interaction database 150. In some embodiments, the data may indicate a time that a corresponding event of the plurality of events occurred, as well as an indication as to whether that event was classified as being fraudulent.

Risk evaluation system 140 may further be configured, in some embodiments, to generate category data by grouping, also referred to herein as binning or segmenting, each of the plurality of events into one or more categories or bins. Each category may be associated with a distinct range of times. For example, category data may be generated including twenty-four categories or bins, each associated with a one-hour time interval across a day. Events that occurred during a particular time interval may be grouped into a corresponding category. As described herein, the terms “bin” and “category” may be used interchangeably.
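
As a minimal sketch of this grouping step, the Python snippet below bins timestamped events into twenty-four hourly categories; the data layout (a list of (timestamp, is_fraudulent) pairs) is an assumption made for illustration only.

```python
from collections import defaultdict
from datetime import datetime

def bin_events_by_hour(events):
    """Group events into 24 categories, one per one-hour interval of the day.

    events: iterable of (timestamp, is_fraudulent) pairs, where timestamp is a
    datetime and is_fraudulent is a bool taken from the fraudulent metadata.
    """
    bins = defaultdict(list)
    for timestamp, is_fraudulent in events:
        bins[timestamp.hour].append((timestamp, is_fraudulent))
    return bins

hourly_bins = bin_events_by_hour([
    (datetime(2017, 12, 19, 0, 30), True),    # 12:30 AM, fraudulent
    (datetime(2017, 12, 19, 13, 15), False),  # 1:15 PM, non-fraudulent
])
```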

A measure of risk may then be determined for each category. The measure of risk may indicate a likelihood that a future event that occurs during a same time interval is fraudulent. For example, if events occurring during the times 12:00-1:00 AM have a measure of risk of 0.79, then this may indicate that a future event that occurs between the hours of 12:00 AM and 1:00 AM, based on substantially the same input conditions, has a 79% chance of being a fraudulent act.

In some embodiments, based on the measures of risk for each category, the category data may be processed to generate new/modified category data. The processing, in one embodiment, may correspond to applying a smoothing function to the category data to reduce any sharp transitions between measure of risk values of adjacent categories. After the processing (e.g., applying the smoothing function), measures of risk may be determined again for each category, and these values may be stored within measure risk database 170 for future use by one or more predictive models. In this way, when a future event occurs, the stored measure of risk values may be employed to determine a likelihood that the future event is fraudulent.

Networked environment 100 of FIG. 1B, in one illustrative embodiment, may be substantially similar to networked environment 100 of FIG. 1A, with the exception that risk evaluation system 140 may serve as a backend for content provider(s) 130. Alternatively, content provider(s) 130 may serve as a backend for risk evaluation system 140.

FIG. 2A is an illustrative diagram of an exemplary risk evaluation system, in accordance with various embodiments of the present teaching. In the non-limiting embodiment, risk evaluation system 140 may include a user event detection system 210, a data bin filling system 212, a measure of risk determination system 214, and a data processing system 216.

In some embodiments, risk evaluation system 140 may include one or more processors, memory, and communications circuitry. The one or more processors of risk evaluation system 140 may include any suitable processing circuitry capable of controlling operations and functionality of one or more components/modules of risk evaluation system 140, as well as facilitating communications between various components within risk evaluation system 140 and/or with one or more other systems/components of network environments 100, 150. In some embodiments, the processor(s) may include a central processing unit (“CPU”), a graphic processing unit (“GPU”), one or more microprocessors, a digital signal processor, or any other type of processor, or any combination thereof. In some embodiments, the functionality of the processor(s) may be performed by one or more hardware logic components including, but not limited to, field-programmable gate arrays (“FPGA”), application specific integrated circuits (“ASICs”), application-specific standard products (“ASSPs”), system-on-chip systems (“SOCs”), and/or complex programmable logic devices (“CPLDs”). Furthermore, each of the processor(s) may include its own local memory, which may store program systems, program data, and/or one or more operating systems. However, the processor(s) may run an operating system (“OS”) for one or more components of risk evaluation system 140 and/or one or more firmware applications, media applications, and/or applications resident thereon. In some embodiments, the processor(s) may run a local client script for reading and rendering content received from one or more websites. For example, the processor(s) may run a local JavaScript client for rendering HTML or XHTML content received from a particular URL accessed by user device(s) 110.

The memory may include one or more types of storage mediums such as any volatile or non-volatile memory, or any removable or non-removable memory implemented in any suitable manner to store data for risk evaluation system 140. For example, information may be stored using computer-readable instructions, data structures, and/or program systems. Various types of storage/memory may include, but are not limited to, hard drives, solid state drives, flash memory, permanent memory (e.g., ROM), electronically erasable programmable read-only memory (“EEPROM”), CD-ROM, digital versatile disk (“DVD”) or other optical storage medium, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, RAID storage systems, or any other storage type, or any combination thereof. Furthermore, the memory may be implemented as computer-readable storage media (“CRSM”), which may be any available physical media accessible by the processor(s) to execute one or more instructions stored within the memory.

The communications circuitry may include any circuitry allowing or enabling one or more components of risk evaluation system 140 to communicate with one another, and/or with one or more additional devices, servers, and/or systems. In some embodiments, communications between one or more components of risk evaluation system 140 may be communicated with user devices 110, content sources 160, content provider(s) 130, etc., via the communications circuitry. For example, network(s) 120 may be accessed using Transfer Control Protocol and Internet Protocol (“TCP/IP”) (e.g., any of the protocols used in each of the TCP/IP layers), Hypertext Transfer Protocol (“HTTP”), WebRTC, SIP, and/or wireless application protocol (“WAP”). Various additional communication protocols may be used to facilitate communications between various components of risk evaluation system 140, including, but not limited to, Wi-Fi (e.g., 802.11 protocol), Bluetooth, radio frequency systems (e.g., 800 MHz, 1.4 GHz, and 5.6 GHz communication systems), cellular networks (e.g., GSM, AMPS, GPRS, CDMA, EV-DO, EDGE, 3GSM, DECT, IS 136/TDMA, iDen, LTE or any other suitable cellular network protocol), infrared, BitTorrent, FTP, RTP, RTSP, SSH, and/or VOIP.

The communications circuitry may use any communications protocol, such as any of the previously mentioned exemplary communications protocols. In some embodiments, one or more components of risk evaluation system 140 may include one or more antennas to facilitate wireless communications with a network using various wireless technologies (e.g., Wi-Fi, Bluetooth, radiofrequency, etc.). In yet another embodiment, one or more components of user activity detection system may include one or more universal serial bus (“USB”) ports, one or more Ethernet or broadband ports, and/or any other type of hardwire access port so that the communications circuitry facilitates communications with one or more communications networks.

User event detection system 210, in the illustrative embodiment, may be configured to receive user interaction data from user interaction database 150. The interaction data may be retrieved periodically (e.g., hourly, daily, weekly, etc.), as well as, or alternatively, in response to a request received by risk evaluation system 140. The interaction data may represent event data corresponding to a plurality of events that occurred at, or were detected by, a content provider 130 during a certain temporal duration. For example, the interaction data may indicate the various user interactions detected by a webpage for a certain time period. Each event included within the interaction data may include, amongst other aspects, temporal metadata and fraudulent metadata. The temporal metadata, for instance, may indicate a time that a corresponding event occurred. The fraudulent metadata, for instance, may indicate whether that particular event is fraudulent or non-fraudulent. In some embodiments, the fraudulent metadata may be determined beforehand, for instance, by a fraudulent user activity detection system.

Data bin filling system 212, which may also be referred to as data category filling system 212, may be configured, in one embodiment, to group events from the interaction data into a corresponding category/bin. For example, data bin filling system 212 may define, determine, or obtain a number of categories into which the interaction data is to be binned, and may generate the number of categories. Data bin filling system 212 may further be configured to analyze each event, and may determine, based at least in part on the temporal metadata, a category into which each event is to be grouped. For example, events occurring between the times 1:01 PM and 2:00 PM may be placed in a first category, events occurring between the times 2:01 PM and 3:00 PM may be placed in a second category, and so on.

Measure of risk determination system 214, in some embodiments, may be configured to determine a measure of risk associated with the events included within each category. The measure of risk, as described herein, may correspond to a likelihood that a future event associated with a particular category will be fraudulent. For example, if the measure of risk for a first category is 0.79, and the first category is associated with events occurring within the time interval 1:01 PM to 2:00 PM, then a future event, occurring at time 1:30 PM, may have a 79% chance of being fraudulent. In some embodiments, measure of risk determination system 214 may be configured to determine whether a difference between the measures of risk of two adjacent categories is greater than a threshold, and if so, may process the category data to generate new category data that smooths the transitions between the categories. For example, if the measure of risk for a first category, associated with the times 1:01 PM to 2:00 PM, is 0.79, while the measure of risk for a second category, associated with the times 2:01 PM to 3:00 PM, is 0.35, then it may be easy for a fraudulent user to circumvent the system by performing actions during the second category's times as opposed to the first category's times to reduce the likelihood of the fraudulent activity being detected. Accordingly, as described in greater detail below, a smoothing function may be applied to the first category data to reduce the difference between the measures of risk of adjacent categories. In some embodiments, measure data representing the measure of risk for each category may be stored within measure risk database 170.

Data processing system 216 may be configured to determine whether a smoothing function needs to be applied to the category data, and may apply the appropriate smoothing function. Continuing the previous example, the difference between the measures of risk for the times 1:01-2:00 PM and 2:01-3:00 PM may be 0.44. If the threshold transition value is set at 0.1, then data processing system 216 may determine that a smoothing function needs to be applied to the category data to reduce the discrepancy between the two categories' measures of risk. As an illustrative example, application of the smoothing function may cause the measures of risk for the two time intervals to change from 0.79 and 0.35 to 0.65 and 0.56.
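
The disclosure does not prescribe a particular smoothing function; the sketch below shows one simple possibility, a pairwise averaging pass that pulls adjacent measures of risk toward their local mean whenever the transition exceeds the threshold. The threshold and weight parameters are hypothetical.

```python
def smooth_adjacent_risks(risks, threshold=0.1, weight=0.5):
    """One illustrative smoothing pass over per-category measures of risk.

    Where two adjacent categories differ by more than `threshold`, each value
    is pulled toward the pair's midpoint to soften the transition.
    """
    smoothed = list(risks)
    for i in range(len(smoothed) - 1):
        if abs(smoothed[i] - smoothed[i + 1]) > threshold:
            midpoint = (smoothed[i] + smoothed[i + 1]) / 2
            smoothed[i] += weight * (midpoint - smoothed[i])
            smoothed[i + 1] += weight * (midpoint - smoothed[i + 1])
    return smoothed

# Two adjacent hourly categories with a sharp transition (0.79 vs. 0.35).
print(smooth_adjacent_risks([0.79, 0.35]))  # values are pulled closer together
```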

FIG. 2B is an illustrative flowchart of an exemplary process for mitigating the effects of feature value manipulation, in accordance with various embodiments of the present teaching. In the illustrative, non-limiting, embodiment, a process 250 may begin at step 252. At step 252, data representing events detected by a content provider may be received. For example, content provider 130 (e.g., a webpage) may detect user activities/user interactions therewith. These interactions may be logged and stored by user interaction database 150. The user interactions may, in one embodiment, correspond to events that occurred during a particular temporal duration (e.g., over the course of one day, or 24 hours). In some embodiments, user event detection system 210 may be configured to receive data representing these events, where the data indicates a time that the event occurred (e.g., via temporal metadata) as well as whether that event corresponds to a fraudulent action (e.g., fraudulent metadata).

At step 254, first category data may be generated. For instance, data bin filling system 212 may be configured to receive the data representing the events from user event detection system 210, and may group the events into a category. In some embodiments, the grouping of events into one of a number of categories may be based on the temporal metadata. For example, data bin filling system 212 may initialize twenty-four (24) data bins, each corresponding to a category (e.g., a one-hour temporal interval), and may group the events into one of the categories. In some embodiments, the categories may be non-overlapping.

At step 256, a first measure of risk may be determined. In some embodiments, the first measure of risk may be determined for each category. For example, a measure of risk for each one-hour time interval may be determined. In one embodiment, the first measure of risk may be determined using WOE, as described above with relation to Equation 1. Measure of risk determination system 214, in the illustrative embodiment, may be configured to determine the measure of risk for each category.

At step 258, second category data may be generated. In some embodiments, data processing system 216 may determine that a smoothing function needs to be applied to the category data to reduce transitions between measure of risk values for adjacent categories. In this particular scenario, data processing system 216 may process the category data by applying the smoothing function. In response, the smoothed data may be provided back to data bin filling system 212, which in turn may generate second category data including the events distributed into the appropriate categories.

At step 260, a second measure of risk may be determined. The second measure of risk may be determined, in some embodiments, in a substantially similar manner as the first measure of risk; however, in this particular instance, the second category data may be employed (e.g., post-processing). In one embodiment, measure of risk determination system 214 may determine the second measure of risk using the second category data.

At step 262, measure data representing the second measure of risk may be stored. For example, the measure data may be stored in measure risk database 170. In some embodiments, the measure risk data may include the measure of risk score for each category, and temporal information indicating an associated temporal interval of the corresponding category.

FIG. 3A is an illustrative diagram of an exemplary user event detection system, in accordance with various embodiments of the present teaching. User event detection system 210, in a non-limiting embodiment, may include an event data receiver 310, a fraudulent flag metadata identifier 312, and a temporal metadata extractor 314. In some embodiments, user event detection system 210 may include a timer 316 and a new data collection initiator 318.

Event data receiver 310, in one embodiment, may be configured to receive user interaction data from user interaction database 150. Event data receiver 310 may receive user interaction data periodically and/or continually. In the latter scenario (e.g., continually), event data receiver 310 may be configured to collect interaction data associated with certain temporal durations prior to being sent to fraudulent flag metadata identifier 312.

Fraudulent flag metadata identifier 312 may be configured to analyze the user interaction data received from event data receiver 310 and determine which events are fraudulent, and which events are not fraudulent. In some embodiments, the user interaction data associated with each event may include fraudulent metadata that indicates whether that particular event represents a fraudulent event. In one embodiment, the fraudulent metadata may be a binary indicator. For instance, data associated with a particular event may include a data flag set to 1 (e.g., a logical 1, TRUE) that indicates that the associated event is a fraudulent event, or the event may include a data flag set to 0 (e.g., a logical 0, FALSE) that indicates that the associated event is a non-fraudulent event. In some embodiments, fraudulent flag metadata identifier 312 may be further configured to parse the user interaction data such that only data associated with events where the fraudulent metadata indicates a fraudulent event is retained. In yet further embodiments, fraudulent flag metadata identifier 312 may parse the interaction data into two groups: data associated with fraudulent events (e.g., flag set to 1) and data associated with non-fraudulent events (e.g., flag set to 0).
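
A minimal sketch of this parsing step appears below, assuming each event is represented as a dictionary carrying a binary fraud flag; the field names are illustrative.

```python
def split_by_fraud_flag(events):
    """Partition events into fraudulent and non-fraudulent groups using the
    binary fraudulent metadata flag (1/True = fraudulent, 0/False = not)."""
    fraudulent = [event for event in events if event["fraud_flag"]]
    non_fraudulent = [event for event in events if not event["fraud_flag"]]
    return fraudulent, non_fraudulent

fraud, clean = split_by_fraud_flag([
    {"event_id": 0, "fraud_flag": 1, "timestamp": "2017-12-19T00:30:00"},
    {"event_id": 1, "fraud_flag": 0, "timestamp": "2017-12-19T13:15:00"},
])
```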

Temporal metadata extractor 314, in one embodiment, may be configured to extract the temporal metadata from the user interaction data. Data associated with each event may include temporal metadata indicating a time that an associated event occurred. Temporal metadata extractor 314 may, therefore, reveal a time that the corresponding event occurred, which may then be used to generate event data employed by data bin filling system 212 to group events into categories. In some embodiments, temporal metadata extractor 314 may log each event that will be output with the event data into an event log 320. Event log 320, for example, may track and store each event, fraudulent as well as, in some instances, non-fraudulent, and a corresponding time that the particular event occurred.

User event detection system 210 may include timer 316, which may be configured to track an amount of time between when user interaction data is sent to event data receiver 310. In some embodiments, timer 316 may begin timing in response to a new data collection initiation signal being sent from new data collection initiator 318 to user interaction database 150. Timer 316 may continue to monitor an amount of time that elapses since the initiation signal was sent until the amount of time reaches and/or exceeds a temporal threshold. For example, the temporal threshold may be a few seconds, minutes, one or more hours, one or more days, and the like. After timer 316 determines that the amount of time has reached and/or exceeded the temporal threshold, timer 316 may notify new data collection initiator 318, which in turn may generate and send an instruction to user interaction database 150 to send user interaction data associated with a particular content provider 130 to event data receiver 310. The instruction may further indicate a temporal duration that the user interaction data should encompass. For example, the instruction may indicate that user interaction database 150 is to send user interaction data for all user events that occurred since a last instance of the instruction was sent from new data collection initiator 318 to user interaction database 150.
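
The loop below is a rough sketch of this timer-and-initiator interaction, assuming a callable that issues the collection instruction to the user interaction database; the callable, threshold value, and polling interval are all illustrative.

```python
import time

def collection_loop(send_instruction, temporal_threshold_seconds=3600.0):
    """Issue a new data-collection instruction whenever the temporal threshold
    has elapsed since the previous instruction (timer 316 / initiator 318 sketch).
    """
    last_sent = time.monotonic()
    while True:
        elapsed = time.monotonic() - last_sent
        if elapsed >= temporal_threshold_seconds:
            send_instruction(elapsed_seconds=elapsed)  # request events since last instruction
            last_sent = time.monotonic()
        time.sleep(1.0)
```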

FIG. 3B is an illustrative flowchart of an exemplary process for generating event data, in accordance with various embodiments of the present teaching. Process 350, in a non-limiting embodiment, may begin at step 352. At step 352, an instruction may be received to obtain user interaction data. For instance, in response to timer 316 determining that a temporal threshold has been met, new data collection initiator 318 may generate an instruction, and may send that instruction to user interaction database 150.

At step 354, user interaction data may be obtained. For instance, event data receiver 310 may receive user interaction data from user interaction database 150. In some embodiments, the user interaction data may be associated with an amount of time that has elapsed since a last instruction to obtain user interaction data was received by user interaction database 150.

At step 356, temporal metadata may be extracted from the user interaction data. The temporal metadata may indicate a time that each event included within the user interaction data occurred. In some embodiments, temporal metadata extractor 314 may extract the temporal metadata associated with each event represented by the user interaction data. At step 358, fraudulent flag metadata may be extracted. Fraudulent flag metadata may indicate whether a corresponding event is classified as being fraudulent. In some embodiments, fraudulent flag metadata identifier 312 may identify whether a particular event is fraudulent based on the fraudulent metadata associated with each event.

At step 360, a log of each event may be stored. For example, each event, as well as whether that event was fraudulent or non-fraudulent, and a time that the particular event occurred, may be logged by event log 320. At step 362, event data may be output. The event data may represent a plurality of events that were detected by a content provider, and may include the temporal metadata (e.g., a time that each of the events occurred) and fraudulent metadata (e.g., whether each event is fraudulent or non-fraudulent). The event data, for example, may be provided to data bin filling system 212 from user event detection system 210.

Persons of ordinary skill in the art will recognize that, in some embodiments, user event detection system 210 may analyze the user interaction data beforehand. For instance, the user interaction data may be processed offline to determine event data representing a plurality of events that were detected by a content provider 130 during a particular temporal duration. In this particular scenario, the event data may further be generated to include the temporal metadata and the fraudulent metadata, such that the extraction of the temporal metadata and the fraudulent metadata for each event may also occur offline. In yet another embodiment, user event detection system 210, or an instance of user event detection system 210, may be in communication with risk evaluation system 140, as opposed to being a sub-system of risk evaluation system 140.

FIG. 4A is an illustrative diagram of an exemplary data bin filling system, in accordance with various embodiments of the present teaching. Data bin filling system 212, in the illustrative embodiment, includes a category type identifier 410, a number of categories determiner 412, a data category setup unit 414, an event-category assigner 416, and a category data generator 418.

Category type identifier 410 may be configured to determine a type of category that is to be analyzed. The type of category, for instance, may be associated with the predictive model or models being employed, as well as the particular feature or features being input to the predictive model. In some embodiments, category type identifier 410 may identify the type of category based on the event data received. For example, if the event data represents a plurality of events, each associated with temporal metadata and fraudulent metadata, then category type identifier 410 may determine that the data is associated with the feature “hour of the day,” corresponding to the category “hours.” The feature “hour of the day,” in one embodiment, may correspond to categories that are associated with each hour interval during the course of a day. In some embodiments, category type identifier 410 may access categories 420.

Categories 420 may correspond to various types of categories associated with various features that may be input to the predictive model(s). For instance, various other features may also be employed including, but not limited to, browser cookie (“bcookie”) age, distance from home, visits in a past X days—where X=1/24, ½, 1, 7, etc., number of clicks for an entity in a past X days, average click-through-rate (“CTR”) for an entity in a past X days, average fraud in an entity in a past X days, ratio of IP addresses over user agents on a website, and the like. As an illustrative example, the feature “cookie age” may correspond to a difference between a current time that an event occurs that stems from a browser cookie (e.g., a click on a content item) and a time when that same browser cookie last logged into the system. For this particular “cookie age” feature, categories may be (0, 1], (1, 3], (3, 7], (7, 15], etc. For instance, for these categories, the values may relate to an amount of time that has elapsed, in days, between a current click event from a browser cookie and the last time that that browser cookie logged into the system (e.g., system 100, 150). As another illustrative example, the feature “distance from home” may correspond to a distance between a current IP address location of a user device associated with a user interaction and a “home” IP address associated with a most frequently used IP address of that user device. In some embodiments, the distance may be calculated in terms of “miles,” however other units of distance (e.g., feet, meters, kilometers, etc.) may be used. For the “distance from home” feature, categories may correspond to (0, 10], (10, 20], (20, 50], etc.
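
For continuous features such as “cookie age” or “distance from home,” assigning an event to one of these half-open range categories can be sketched as follows; the boundary list and function name are illustrative only.

```python
def assign_range_category(value, upper_boundaries):
    """Assign a feature value to a category defined by half-open ranges
    (b_{i-1}, b_i]. For example, boundaries [1, 3, 7, 15] give the cookie-age
    categories (0, 1], (1, 3], (3, 7], (7, 15]."""
    for index, upper in enumerate(upper_boundaries):
        if value <= upper:
            return index
    return len(upper_boundaries)  # overflow category beyond the last boundary

category_index = assign_range_category(2.5, [1, 3, 7, 15])  # -> 1, i.e. (1, 3]
```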

Number of categories determiner 412 may be configured to determine a number of categories that may be employed based on the category type. For example, if the category corresponds to hour time intervals associated with the feature “hours of the day,” then the number of categories may be determined to be 24 categories. Each category of the twenty-four categories corresponds to a single one-hour temporal interval during the course of the day. For instance, a first category may correspond to the range of times beginning at a time 0:00 and ending at a time 0:59, a second category may correspond to a range of times beginning at 1:00 and ending at a time 1:59, and so on. Persons of ordinary skill in the art will recognize that a different number of categories may also be employed for similar features. For example, if the feature corresponds to minutes of the day, then the category may be minutes, and there may be 1,440 categories (e.g., a first category from time 00:00:00 to time 00:00:59, a second category from time 00:01:00 to time 00:01:59, etc.).

Data category setup unit 414 may be configured to generate the data categories. For example, data category setup unit 414 may allocate portions of memory associated with risk evaluation system 140 for each of the number of categories. The allocated portions may be structured such that data representing each data event is stored within memory and associated with other events having a same parameter. Category range(s) 422 may be employed to determine an appropriate range of units for each category set up by data category setup unit 414. For example, category range(s) 422 may indicate that, for the category “hours,” associated with the feature “hours of the day,” the category ranges should be associated with 60-minute temporal intervals.

In some embodiments, data category setup unit 414 may employ a Chi-Merge discretization algorithm to determine an optimal range of values for each category. Category size(s) 424 may, in one embodiment, specify a particular amount of memory needed to be allocated for each event and/or restrictions associated with the amount of memory being allocated. For example, X bits may be allocated within the memory, which may have X/N bits available for storing event data falling within a specified range of category range(s) 422. In the aforementioned example, N corresponds to the number of categories.

In a Chi-Merge discretization technique, categories are continuously merged until a termination condition is met. The Chi-Merge technique may begin by first sorting the training examples according to the feature values being discretized. Next, the categories may be constructed using the initial discretization such that each example is assigned to its own category. A Chi-Square test may then be performed for every pair of adjacent categories (e.g., category(i−1) and category(i), category(i) and category(i+1), etc.), with the pair having the lowest chi-square value being merged together. The merging may recursively be performed until all pairs of adjacent categories have Chi-Squared values exceeding a threshold value. For example, the threshold may be determined using a P-value test, or may be a constant (e.g., based on the number of categories, a minimum number of examples in a category, etc.).
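
The following is a simplified Python sketch of the Chi-Merge procedure described above, tracking only the per-category (fraudulent, non-fraudulent) counts and omitting boundary bookkeeping; the threshold value and function names are assumptions made for illustration.

```python
def chi_square(a, b):
    """Chi-square statistic for two adjacent categories, each given as a
    (fraud_count, non_fraud_count) pair (2x2 Chi-Merge-style test)."""
    rows = [a, b]
    col_totals = [a[0] + b[0], a[1] + b[1]]
    total = sum(col_totals)
    chi2 = 0.0
    for row in rows:
        row_total = sum(row)
        for j, observed in enumerate(row):
            expected = row_total * col_totals[j] / total
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return chi2

def chi_merge(categories, threshold):
    """Repeatedly merge the adjacent pair with the lowest chi-square value
    until every adjacent pair exceeds the threshold."""
    cats = list(categories)
    while len(cats) > 1:
        scores = [chi_square(cats[i], cats[i + 1]) for i in range(len(cats) - 1)]
        lowest = min(range(len(scores)), key=scores.__getitem__)
        if scores[lowest] > threshold:
            break
        merged = (cats[lowest][0] + cats[lowest + 1][0],
                  cats[lowest][1] + cats[lowest + 1][1])
        cats[lowest:lowest + 2] = [merged]
    return cats

merged_categories = chi_merge([(2, 8), (3, 7), (20, 5)], threshold=3.84)
```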

Although Chi-Merge discretization is one example of a technique to “bin” data into different categories, various other types of “binning mechanisms” may also be employed. For instance, an “equal-width binning” technique or an “equal-population binning” technique may alternatively be employed. For equal-width binning, the range of observed values for a particular feature is grouped into k categories, each having an equal range. In this particular scenario, the parameter k may be set by an individual using user device 110. As an illustrative example, if a feature has a maximum feature value of x_max and a minimum feature value of x_min, then the range employed for this feature may be δ = (x_max − x_min)/k, and each category may be constructed such that boundaries for that category are defined at x_min + iδ, where i = 1, 2, . . . , k−1. Equal-population binning techniques may correspond to grouping feature values of a continuous feature into k categories where, given n examples, each category includes n/k (possibly duplicate) adjacent values.
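
A minimal sketch of the equal-width scheme described above follows, using δ = (x_max − x_min)/k and boundaries at x_min + iδ; the function name and return layout are illustrative.

```python
def equal_width_bins(values, k):
    """Equal-width binning: split the observed range into k categories of
    equal width delta = (x_max - x_min) / k."""
    x_min, x_max = min(values), max(values)
    delta = (x_max - x_min) / k
    # Interior boundaries at x_min + i * delta for i = 1, ..., k - 1.
    boundaries = [x_min + i * delta for i in range(1, k)]
    bins = [[] for _ in range(k)]
    for value in values:
        index = min(int((value - x_min) / delta), k - 1) if delta else 0
        bins[index].append(value)
    return boundaries, bins

boundaries, bins = equal_width_bins([0.2, 1.5, 3.3, 7.8, 9.9], k=5)
```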

Event-category assigner 416 may be configured, in some embodiments, to assign each event represented by the event data to a particular category. For instance, event data may be segmented into categories based on the temporal metadata associated with each event. For example, data category setup unit 414 may initialize each category's data structure, and event-category assigner 416 may assign each event to an appropriate category's data structure. If the categories are temporal categories, for example, each associated with a particular temporal interval (e.g., 24 one-hour time interval categories), then each event may be placed in one category based on the temporal metadata, which indicates a time that the corresponding event occurred.

Category data generator 418 may, in some embodiments, generate the category data to be output to measure of risk determination system 214. Category data generator 418 may be configured to determine when event-category assigner 416 has completed the assignment of each event included within the event data to a corresponding category data structure. In response, category data generator 418 may finalize the data structures, and may output the category data representing the plurality of data structures including the plurality of events, each having a corresponding event identifier, fraudulent identifier, and timestamp.

FIG. 4B is an illustrative diagram of an exemplary data structure for category data, in accordance with various embodiments of the present teaching. In some embodiments, the allocated memory may be configured in a data structure 440 (or structures) such that an entry in data structure 440 is associated with each event. Data category setup unit 414, for instance, may generate data structures, such as data structure 440, for storing and logging event data and transforming that event data into associated category data, which may then be used for determining a measure of risk associated with each category.

Each entry may include three or more columns, such as event identifier column 442, fraudulent flag column 444, and timestamp column 446. For example, an event may be assigned to a row of the data structure, having an instance identifier associated with that event stored within event identifier column 442 (e.g., the first event having an identifier 0, the second event having an identifier 1, . . . the L-th event having an identifier L, etc.). The event may have a fraudulent metadata indicator stored within fraudulent flag column 444, indicating whether the corresponding event of that row (e.g., an entry) is fraudulent or non-fraudulent (e.g., logical 0 for non-fraudulent and logical 1 for fraudulent). The event may further have a timestamp stored within timestamp column 446 indicating a time associated with that row's event, indicating when that event occurred. The format of the timestamp may vary depending on the system, user, etc.

In some embodiments, each category of the number of categories may include a separate data structure 440. For example, each category associated with a single one-hour temporal interval over the course of a 24-hour period may include a separate data structure 440 including rows representing events that occurred within that category's corresponding hour temporal interval, and columns 442, 444, and 446, indicating each event's event identifier, fraudulent identifier, and timestamp. The output category data may, in some embodiments, include each data structure 440 associated with each category for a particular feature being analyzed.
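
One way to picture data structure 440 in code is the per-category table sketched below, with one entry per event and the three columns described above; the class names and timestamp format are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class EventEntry:
    event_id: int    # event identifier column 442
    fraud_flag: int  # fraudulent flag column 444 (0 = non-fraudulent, 1 = fraudulent)
    timestamp: str   # timestamp column 446

@dataclass
class CategoryTable:
    """One data structure 440 per category, e.g. one per one-hour interval."""
    hour: int
    entries: List[EventEntry] = field(default_factory=list)

table = CategoryTable(hour=13)
table.entries.append(EventEntry(event_id=0, fraud_flag=1, timestamp="2017-12-19T13:05:00"))
```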

FIG. 4C is an illustrative flowchart of an exemplary process for generating bin data, in accordance with various embodiments of the present teaching. Process 450, in a non-limiting embodiment, may begin at step 452. At step 452, event data may be received. For instance, category type identifier 410 may receive event data output by user event detection system 210. At step 454, a category type to be analyzed may be identified. In one embodiment, category type identifier 410 may access categories 420 to determine a type of category that the event data is associated with. At step 456, a number of categories associated with the identified category type may be determined. For instance, number of categories determiner 412 may determine, based on the type of category identified by category type identifier 410, a number of categories to be employed. For example, for the feature “hours of the day,” the category may correspond to “hours,” and number of categories determiner 412 may determine that twenty-four categories are to be used.

At step 458, data categories may be set up. For example, as described above, data category setup unit 414 may allocate portions of memory having a particular category size 424 and being associated with a particular category range 422 for data structures to be populated based on the event data. At step 460, events may be assigned to a data category based on temporal metadata included with the event data and associated with each event. For example, event-category assigner 416 may be configured to assign each event to a particular category's data structure based on the category's temporal range and each event's temporal metadata. At step 462, category data may be generated. For instance, category data generator 418 may generate the category data representing the category data structures populated by the event data. At step 464, the category data may be output. For instance, the category data may be output to measure of risk determination system 214.

FIG. 5A is an illustrative diagram of an exemplary measure of risk determination system, in accordance with various embodiments of the present teaching. Measure of risk determination system 214, in the illustrative embodiment, may include a fraudulent event identifier 510, a non-fraudulent event identifier 512, a measure of risk calculator 514, a temporal metadata combiner 516, and a measure data generator 518. In some embodiments, fraudulent event identifier 510 and non-fraudulent event identifier 512 may form an event identifier system 522, which is described in greater detail below.

Event identifier system 522 may, in one embodiment, receive the category data output by data bin filling system 212. Upon receipt, event identifier system 522 may provide the category data to fraudulent event identifier 510 and non-fraudulent event identifier 512. In some embodiments, the category data may be provided to fraudulent event identifier 510 and non-fraudulent event identifier 512 in parallel; however, persons of ordinary skill in the art will recognize that this is merely exemplary. Fraudulent event identifier 510 may be configured to identify the events included within the category data that include a fraudulent metadata flag indicating that the corresponding event is identified as being fraudulent. Non-fraudulent event identifier 512 may be configured to identify the events included within the category data that include a fraudulent metadata flag indicating that the corresponding event is identified as being non-fraudulent. For example, a row in data structure 440 may have, for fraudulent identifier column 444, a logical 1, indicating that the corresponding event associated with that row has been classified as being fraudulent. As another example, a row in the data structure having a logical 0 in fraudulent identifier column 444 may indicate that the corresponding event associated with that row has been classified as being non-fraudulent.

Persons of ordinary skill in the art will recognize that the classification of an event as being fraudulent or non-fraudulent may be done prior to analysis by risk evaluation system 140, in some embodiments. However, in other embodiments, risk evaluation system 140 may perform the fraudulent/non-fraudulent analysis for each event as well. The basis for analyzing fraudulence for each event may, in some embodiments, be a particular entity associated with the event (e.g., a user device 110 that interacted with content provider 130), content provider 130 with which the user interaction data is associated, and the like. For example, the user interaction data may indicate a click-through-rate (“CTR”) and/or a time-to-click (“TTC”) for a particular event or series of events, and risk evaluation system 140 may determine whether the corresponding event(s) is/are fraudulent based on the CTR and/or TTC for that event in relation to one or more threshold CTR and/or TTC values.

Measure of risk calculator 514, in some embodiments, may be configured to determine an amount of risk associated with each category. In some embodiments, measure of risk calculator 514 may employ risk measure models 520 to determine the measure of risk associated with each category. For example, WOE, as described above with relation to Equation 1, may be employed as a risk measure model 520 and used by measure of risk calculator 514 to determine a measure of risk for one or more categories. In one embodiment, measure of risk calculator 514 may generate a measure of risk for each category, and may output the measure of risk associated with each category to temporal metadata combiner 516. In some embodiments, measure of risk calculator 514 may be configured to add an additional column to data structure 440, which may include the measure of risk value generated for the corresponding category represented by that data structure 440. Alternatively or additionally, each data structure 440 may have a metadata addendum attached thereto that includes the calculated measure of risk for that category.
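
Equation 1 is not reproduced in this portion of the description, so the sketch below uses a common weight-of-evidence style statistic (the log of the non-fraudulent count over the fraudulent count, matching the numerator and denominator roles later attributed to Equation 5); the function name and the eps guard are assumptions:

```python
import math

def measure_of_risk(category_rows, eps=1e-9):
    """Assumed WOE-style measure of risk for one category: each row is an
    (event_id, is_fraudulent, timestamp) tuple from data structure 440."""
    fraud = sum(1 for _, is_fraudulent, _ in category_rows if is_fraudulent)
    non_fraud = len(category_rows) - fraud
    return math.log((non_fraud + eps) / (fraud + eps))

# Example: a category with three non-fraudulent events and one fraudulent event.
rows = [(0, 0, 1.0), (1, 0, 2.0), (2, 1, 3.0), (3, 0, 4.0)]
risk = measure_of_risk(rows)  # log(3 / 1) ≈ 1.10
```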

Temporal metadata combiner 516, in one embodiment, may be configured to combine the measure of risk calculated for each category by measure of risk calculator 514 with the corresponding temporal range associated with that category. For example, if a first category is determined to have a measure of risk value of 0.60, and the first category is associated with a temporal interval of times t1-t2, then temporal metadata combiner 516 may combine these two pieces of information together.

Measure data generator 518, in one embodiment, may be configured to generate measure data representing the measure of risk for each category and the range of times associated with that category. In some embodiments, measure data generator 518 may be configured to generate a new data structure including the plurality of temporal ranges associated with the number of categories, and the measure of risk values associated with each one of those categories. For example, measure data generator 518 may generate a data structure including rows, where each row represents a corresponding category of the number of categories (hence the number of rows corresponding to the number of categories). Furthermore, the data structure may include two or more columns, where one column may indicate the temporal range associated with that category, and another column may indicate the measure of risk value associated with that category. Measure data generator 518 may therefore be configured to output the measure data to measure risk database 170 for storage. Furthermore, in some embodiments, measure data generator 518 may be configured to output the measure data to data processing system 216.

As described in greater detail below, in some embodiments, measure of risk determination system 214 may be configured to determine a new measure of risk for processed category data in response to data processing system 216 processing the category data and/or one or more conditions being met.

FIG. 5B is an illustrative flowchart of an exemplary process for determining a measure of risk, in accordance with various embodiments of the present teachings. In a non-limiting embodiment, process 550 may begin at step 552. At step 552, category data may be received. For instance, category data provided by data bin filling system 212 may be received by event identifier system 522. At step 554, fraudulent events may be identified. For example, fraudulent event identifier 510 may identify the fraudulent events included within the category data based on fraudulent identifiers included within fraudulent identifier column 444 of data structure(s) 440. At step 556, non-fraudulent events may be identified. For example, non-fraudulent event identifier 512 may identify the non-fraudulent events included within the category data based on fraudulent identifiers included within fraudulent identifier column 444 of data structure(s) 440.

At step 558, a measure of risk value may be calculated. The measure of risk value may be calculated for each category, in some embodiments. In one embodiment, measure of risk calculator 514 may calculate the measure of risk for each category based, at least in part, on the category data and the analysis of fraudulent events by fraudulent event identifier 510 and non-fraudulent event identifier 512.

At step 560, measure of risk values may be combined with temporal metadata. For instance, temporal metadata combiner 516 may be configured to combine the measure of risk values for each category with the temporal ranges associated with the corresponding category. At step 562, measure data may be generated. For instance, as described above, measure data generator 518 may generate measure data indicating the measure of risk value for each category and the corresponding temporal range. In some embodiments, measure data generator 518 may be configured to generate a new or modified data structure including the measure of risk value for each category associated with the feature being analyzed, as well as the temporal range (e.g., range of times) associated with that category. At step 564, the measure data may be output. In some embodiments, the measure data may be output to measure risk database 170 for storage. Additionally, or alternatively, the measure data may be output to data processing system 216 for further processing.

FIG. 6A is an illustrative diagram of an exemplary data processing system, in accordance with various embodiments of the present teaching. Data processing system 216, in the illustrative embodiment, may include a category risk value identifier 610, a transition determination unit 612, a smoothing function usage determiner 614, and a smoothed data category generator 616.

Category risk value identifier 610 may be configured, in some embodiments, to receive category data from measure of risk determination system 214. Upon receiving the category data, category risk value identifier 610 may be configured to extract, for each category, a corresponding measure of risk value. For example, for the category “hours” associated with the feature “hours in the day,” category risk value identifier 610 may extract 24 measure of risk values, each associated with one of the 24 categories.

Transition determination unit 612 may, in some embodiments, be configured to determine a difference in measure of risk values between adjacent categories. For example, category 1 may have a measure of risk value of 0.54, while category 2 may have a measure of risk value of 0.62. In this particular scenario, the transition between category 1 and category 2 may be 0.08 (e.g., |WOE_1 − WOE_2|). Typically, the difference in measure of risk values between adjacent categories, sometimes referred to as data bins, is small (e.g., WOE_i ≈ WOE_(i+1) and WOE_i ≈ WOE_(i−1)); however, this may not always be the case. In some embodiments, while a Chi-Merge discretization algorithm may be employed to determine an optimal range of values for each category, the corresponding transitions between adjacent categories may still be large.

In one embodiment, transition determination unit 612 may access thresholds 620. Transition determination unit 612 may then determine whether the transition between adjacent categories is greater than or equal to the accessed threshold(s) 620. If the transition is greater than or equal to the threshold 620, then transition determination unit 612 may determine that a condition or criterion has been satisfied, and therefore data processing may be needed to reduce the gap between the measure of risk values for adjacent categories. In this particular scenario, transition determination unit 612 may notify processing function usage determiner 614 that a data processing function 618 may be needed for processing the data. For example, an illustrative processing technique may correspond to data smoothing, where a data smoothing function is applied to the category data. Persons of ordinary skill in the art will recognize that not all category data may require data processing and not all transitions may indicate processing is needed. For example, even if transitions between measure of risk values of adjacent categories do not exceed thresholds 620, processing usage determiner 614 may still be employed to process the data to generate smoother transitions between adjacent categories. In some embodiments, data processing function 618 may further indicate a type of function/representation to be used for the processing.
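
One way transition determination unit 612 could be sketched, assuming the per-category risk values are held in a simple list ordered by category (the function and parameter names are illustrative assumptions), is:

```python
def transitions_exceeding_threshold(risk_by_category, threshold):
    """Compute |WOE_i - WOE_(i+1)| for each pair of adjacent categories and
    report the pairs whose gap meets or exceeds the configured threshold."""
    flagged = []
    for i in range(len(risk_by_category) - 1):
        gap = abs(risk_by_category[i] - risk_by_category[i + 1])
        if gap >= threshold:
            flagged.append((i, i + 1, gap))
    return flagged

# The example above: adjacent categories with WOE values 0.54 and 0.62 produce a
# 0.08 transition, which a 0.05 threshold would flag for data processing.
print(transitions_exceeding_threshold([0.54, 0.62], threshold=0.05))
```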

In example embodiments, data processing functions 618 may include, but are not limited to, trapezoidal functions, triangular functions, and Gaussian functions. For instance, for a given feature, there may be “m” categories. A “width” of each category may be defined by Equation 2:

width_(i) = p_(i+1) − p_(i)  Equation 2.

In Equation 2, p_(i) corresponds to a boundary of the i-th category. The smoothing function that may be employed may be formulated based on width_(i) and boundary p_(i).

As an illustrative example, a trapezoidal-based processing function may be employed. The trapezoidal-based processing function may be represented by Equation 3:

$\mu_{i}(x) = \begin{cases} 0, & x < a_{i} \\ \frac{x - a_{i}}{b_{i} - a_{i}}, & a_{i} \leq x < b_{i} \\ 1, & b_{i} \leq x < c_{i} \\ \frac{d_{i} - x}{d_{i} - c_{i}}, & c_{i} \leq x < d_{i} \\ 0, & x \geq d_{i} \end{cases}$  Equation 3.

In Equation 3, a_(i) corresponds to a lower base boundary of the i-th category's trapezoidal function, d_(i) corresponds to an upper base boundary of the i-th category's trapezoidal function, b_(i) corresponds to a lower plateau boundary of the i-th category's trapezoidal function, and c_(i) corresponds to an upper plateau boundary of the i-th category's trapezoidal function. Furthermore, in Equation 3, μ_(i)(x) may correspond to a “membership function.” Membership functions, as described herein, may correspond to a mechanism for applying the smoothing and, as illustrated, are determined based on the number of categories (e.g., bins) and the boundary points for a particular feature. For a given value of x, there will be m membership degrees, corresponding to the m categories.
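
Equation 3 translates directly into a small function; the sketch below is a transcription of that piecewise definition, with the argument order (a, b, c, d) following the lower base, lower plateau, upper plateau, and upper base boundaries just described:

```python
def trapezoidal_membership(x, a, b, c, d):
    """Membership degree mu_i(x) of value x in the i-th category's trapezoidal
    function, per Equation 3 (assumes a <= b <= c <= d)."""
    if x < a or x >= d:
        return 0.0                 # outside the trapezoid's base
    if x < b:
        return (x - a) / (b - a)   # rising ramp
    if x < c:
        return 1.0                 # plateau
    return (d - x) / (d - c)       # falling ramp
```

A value on the plateau receives a full membership degree of 1, while values on either ramp receive a partial degree between 0 and 1.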

In the example, some of the m membership degrees are zero, while some are non-zero. The non-zero membership degrees may be determined, and may be multiplied by the target values. A target value, as described herein, corresponds to a binary value of 0 or 1, in one embodiment. For fraudulent events, the target value is set such that y=1, whereas for non-fraudulent events, the target value is set such that y=0. A value of a feature, as described herein, may be related to a membership degree of that feature. For example, a membership degree may reflect an extent to which a particular feature value is included within a corresponding category. For example, a category may have a lower bound x_1 and an upper bound x_2. The membership degree may be determined, for example, using Equation 3, where the feature's value may fall between lower bound x_1 and upper bound x_2. The membership degree, in the illustrative example, resides between 0 and 1. The result of this computation is represented, in one embodiment, by Equation 4:

$\bar{y}_{i} = \frac{\sum_{n=1}^{N} \mu_{i}(x_{n})\, y_{n}}{\sum_{n=1}^{N} \mu_{i}(x_{n})}, \quad \text{for } i = 1, 2, \ldots, m.$  Equation 4.

The results of Equation 4, therefore, may correspond to the smoothed data. Furthermore, by obtaining the smoothed data for each category, the encoded WOE values (e.g., smoothed measure of risk values) may be determined for each category. The encoded WOE values may be represented by Equation 5:

$woe_{i} = \log\left[ \frac{\sum_{b=i-1}^{i+1} \mu_{b}(x_{n})\, \bar{y}_{b}}{\sum_{b=i-1}^{i+1} \mu_{b}(x_{n}) - \sum_{b=i-1}^{i+1} \mu_{b}(x_{n})\, \bar{y}_{b}} \right].$  Equation 5.

In Equation 5, the numerator reflects the number of non-fraudulent events, while the denominator reflects the number of fraudulent events.
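
Under the assumption that the membership functions of Equation 3 are available as callables (one per category) and that the targets y_n follow the 0/1 convention above, Equations 4 and 5 could be evaluated as in this sketch; the names smoothed_targets and encoded_woe are illustrative, and restricting i to interior categories is an assumption made to keep the (i−1, i, i+1) window in range:

```python
import math

def smoothed_targets(values, targets, memberships):
    """Equation 4: the membership-weighted average target ybar_i for each
    category i, given feature values x_n and targets y_n (1 = fraudulent)."""
    ybar = []
    for mu in memberships:
        num = sum(mu(x) * y for x, y in zip(values, targets))
        den = sum(mu(x) for x in values)
        ybar.append(num / den if den else 0.0)
    return ybar

def encoded_woe(x, i, memberships, ybar):
    """Equation 5: encoded WOE for category i at feature value x, using the
    neighboring categories i-1, i, i+1 (assumes 1 <= i <= m-2)."""
    num = sum(memberships[b](x) * ybar[b] for b in (i - 1, i, i + 1))
    den = sum(memberships[b](x) for b in (i - 1, i, i + 1)) - num
    return math.log(num / den)
```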

Processed data category generator 616, in one embodiment, may be configured to apply the processing function to the category data, as described above, and generate processed category data. The processed category data may represent each category's measure of risk value after processing has been performed. For example, in response to a smoothing function being applied to generate encoded WOE values, as represented by Equation 5, processed data category generator 616 may output data structures for each category. The data structures output by processed data category generator 616 may, in some embodiments, include rows corresponding to each entry, such that for m categories there are m rows, and for each row, there are two or more columns: one column indicating the range associated with that row, and one column indicating the measure of risk value associated with the post-processed category data.

FIG. 6B is an illustrative flowchart of an exemplary process for determining and applying a smoothing function to bin data, in accordance with various embodiments of the present teaching. Process 650 may begin, in a non-limiting embodiment, at step 652. At step 652, category data may be received. For example, category data output by measure of risk determination system 214 may be received by category risk value identifier 610. At step 654, a risk value associated with each category may be determined. For instance, category risk value identifier 610 may determine a measure of risk associated with each category.

At step 656, a transition value between risk values of adjacent categories may be determined. For instance, transition determination unit 612 may determine a difference between a measure of risk value for category 1 and category 2, category 2 and category 3, and so on. In some embodiments, transition determination unit 612 may determine whether any of the differences equal or exceed threshold(s) 620. At step 658, a processing function may be determined. In some embodiments, processing usage determiner 614 may determine a data processing function 618 to use to process the category data. At step 660, the processing function may be applied to the category data. At step 662, the processed category data may be generated. For example, processed data category generator 616 may generate the processed category data by applying the processing function to the category data. At step 664, the processed category data may be output. In some embodiments, the processed category data may be sent to data bin filling system 212 to be redistributed into categories, and new measure of risk values may then be determined by measure of risk determination system 214. In other embodiments, the processed category data may be provided to measure of risk determination system 214 for determining new measure of risk values for each category based on the applied processing. In still yet further embodiments, the processed category data may be stored within measure risk database 170.

FIG. 7A is an illustrative diagram of an exemplary event identifier, in accordance with various embodiments of the present teaching. Event identifier 522, as described previously with relation to FIG. 5A, may be configured to determine a type of event associated with particular category data. Event identifier 522, in the illustrative embodiment, may include a category type determiner 710, a fraudulent event analyzer 712, and a fraudulent event metadata generator 714. In some embodiments, event identifier 522 may be employed additionally or alternatively with respect to fraudulent event identifier 510 and non-fraudulent event identifier 512. For example, if a determination of whether events included with category data are fraudulent has yet to occur, event identifier 522, as described with reference to FIG. 7A, may be employed.

Category type determiner 710 may be configured, in one embodiment, to determine a type of category associated with the category data. Category type determiner 710 may receive category data, and based on the category data, may determine a type of category and/or a feature associated with the category data. In some embodiments, category type determiner 710 may be substantially similar to category type identifier 410, and the previous description may apply.

Fraudulent event analyzer 712 may be configured to determine whether each of the events represented within the category data is fraudulent. In some embodiments, fraudulent event analyzer 712 may access fraudulent event parameters 716 to determine the parameters that indicate whether a particular event is classified as being fraudulent or non-fraudulent. For example, fraudulent event parameter(s) 716 may include CTR and/or TTC, amongst other parameters. In one embodiment, the event data may indicate a particular CTR for each event represented by the event data, and based on the CTR, fraudulent event analyzer 712 may classify each event as being fraudulent or non-fraudulent.
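
A minimal sketch of how fraudulent event analyzer 712 might apply such parameters is shown below; the specific threshold values and the rule combining CTR and TTC are illustrative assumptions, not values taken from the present teaching:

```python
def classify_event(ctr, ttc_seconds, ctr_threshold=0.5, ttc_threshold=1.0):
    """Classify an event as fraudulent (1) or non-fraudulent (0) by comparing
    its click-through-rate and time-to-click against configured parameters."""
    suspiciously_high_ctr = ctr >= ctr_threshold
    suspiciously_fast_click = ttc_seconds <= ttc_threshold
    return 1 if (suspiciously_high_ctr or suspiciously_fast_click) else 0

# Example: an entity clicking on 80% of impressions within 0.3 seconds.
flag = classify_event(ctr=0.8, ttc_seconds=0.3)  # 1, i.e., classified as fraudulent
```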

Fraudulent event metadata generator 714 may be configured to obtain the classifications of fraudulence for each event as performed by fraudulent event analyzer 712, and may generate metadata indicating that particular event's fraudulence. In some embodiments, fraudulent event metadata generator 714 may generate a logical 1 to indicate that an event has been classified as fraudulent, while a logical 0 may be generated to indicate that an event has been classified as non-fraudulent. The fraudulent event metadata that is generated for each event may then be output by event identifier 522 and appended to a data structure associated with the event's category. For example, column 444 of data structure 440 may be populated based on the metadata generated by fraudulent event metadata generator 714. Furthermore, in some embodiments, event identifier 522 may be employed prior to analysis by risk evaluation system 140. In this particular scenario, event identifier 522 may access interaction data from interaction database 150, and may determine whether events represented by the interaction data correspond to fraudulent events or non-fraudulent events. In that case, the determination of fraudulence will have already occurred, and therefore risk evaluation system 140 may not need to perform such a task.

FIG. 7B is an illustrative flowchart of an exemplary process for determining whether an event is a fraudulent event, in accordance with various embodiments of the present teaching. Process 750, in a non-limiting embodiment, may begin at step 752. At step 752, category data may be received. For example, category type determiner 710 may receive category data from data bin filling system 212. At step 754, a category type may be determined. For example, category type determiner 710 may determine a type of category that the category data is associated with. At step 756, one or more fraudulent event parameters may be obtained. For instance, fraudulent event parameter(s) 716 may be obtained based on the type of category that was determined by category type determiner 710. At step 758, each event included within the category data may be classified. In some embodiments, the classification of each event may be based on the category type and the one or more fraudulent event parameters 716. At step 760, metadata tags for each event may be generated. For instance, the metadata tags may indicate whether each event represents a fraudulent event or a non-fraudulent event. At step 762, event fraudulent tag metadata may be output. For instance, the category data may be appended to include the metadata tags for each event included within the category data.

An illustrative example of the process for reducing risk value discrepancies between categories will now be described. This example is meant to be illustrative and not limiting.

A predictive model may include a feature, represented by X. Feature X may have nine categories. The nine categories for feature X may be determined, for example, based on a Chi-Merge discretization algorithm, as described above. The categories may have boundaries associated with the ranges that each boundary corresponds to. Table 1 details, for example, the nine categories associated with feature X, the boundaries associated with feature X, and the measure of risk values for feature X. In the illustrative embodiment, the measure of risk values for feature X correspond to measure of risk values determined prior to data processing being performed.

TABLE 1

Category Identifier   Category Boundaries   Measure of Risk Value
1                     (0, 70.85]            −0.77189
2                     (70.85, 125.09]       −0.576893
3                     (125.09, 199.99]      −0.322446
4                     (199.99, 321.73]      −0.272197
5                     (321.73, 442.66]      −0.198860
6                     (442.66, 681.79]       0.375895
7                     (681.79, 803.00]       0.464761
8                     (803.00, 989.91]       0.553614
9                     (989.91, ∞)            0.712883

As seen from Table 1, a feature value of X=442 corresponds to category 5 and a measure of risk value (e.g., WOE value) of −0.198860. If the feature value is, instead, slightly changed, such that X=443, then X corresponds to category 6, having a measure of risk value of 0.375895. The difference between these two measure of risk values, therefore, is 0.574755. Therefore, the transition from category 5 to category 6 is substantially large.

Inputting the measure of risk value for X=442 and X=443 into the predictive model may yield predictive scores. In one example embodiment, the predictive score for a predictive model may be represented by Equation 6:

Predictive Score ≈ 797.67 + 100.72 * Measure_of_risk  Equation 6.

Using Equation 6, the predictive score for X=442 would be approximately 778, whereas the predictive score for X=443 would be approximately 836. Therefore, a one-point difference in feature value X corresponds to a 58-point predictive score variance. In some embodiments, a threshold may be set for fraudulent activity detection. This threshold, for example, may be set at 800 in many real-world scenarios. Therefore, the one-point difference in feature value X may cause an event to be associated with suspicious activity. By applying the processing function, as described above, and re-determining the measure of risk values, the measure of risk values may be reduced/adjusted such that such small differences do not cause large variances in predictive scores.
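
The 58-point jump can be reproduced with a simple interval lookup over Table 1 and the approximate scoring rule of Equation 6; the bisect-based lookup below is an implementation choice for illustration, not part of the present teaching:

```python
import bisect

# Upper boundaries and pre-smoothing measure of risk (WOE) values from Table 1.
UPPER_BOUNDS = [70.85, 125.09, 199.99, 321.73, 442.66, 681.79, 803.00, 989.91]
WOE_VALUES = [-0.77189, -0.576893, -0.322446, -0.272197, -0.198860,
              0.375895, 0.464761, 0.553614, 0.712883]

def woe_for(x):
    """Return the measure of risk for feature value x (boundaries closed on the right)."""
    return WOE_VALUES[bisect.bisect_left(UPPER_BOUNDS, x)]

def predictive_score(measure_of_risk):
    """Equation 6: the approximate predictive scoring rule."""
    return 797.67 + 100.72 * measure_of_risk

print(predictive_score(woe_for(442)))  # ≈ 778, below an 800 threshold
print(predictive_score(woe_for(443)))  # ≈ 836, above it, from a one-point feature change
```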

FIG. 7C is an illustrative graph of two example processed categories, in accordance with various embodiments of the present teachings. Graph 770, in the illustrative embodiment, includes a feature represented by trapezoidal functions that has been segmented into two categories, “Category 5” and “Category 6,” as described in reference to Table 1 above. Category 5 has a lower bound being a first feature value (e.g., X=321.73) and an upper bound being a second feature value (e.g., X=442.66). Category 6 has a lower bound being the second feature value (e.g., X=442.66) and an upper bound being a third feature value (e.g., X=681.79). Therefore, in the illustrative embodiment, Category 5 has a first width (e.g., 442.66−321.73=120.93), while Category 6 has a second width (e.g., 681.79−442.66=239.13).

Each trapezoidal function encompassed by Category 5 and Category 6 includes four vertices, each having a membership degree, “f.” For instance, Category 5 includes a lower left limit f(5)(0), an upper left limit f(5)(1), an upper right limit f(5)(2), and a lower right limit f(5)(3). Category 6 includes a lower left limit f(6)(0), an upper left limit f(6)(1), an upper right limit f(6)(2), and a lower right limit f(6)(3). The determination of each limit may be represented below by Table 2.

TABLE 2

Limit Reference   Limit Representation                               Value
f(5)(0)           Category 5 (lower bound) − α * first_width         321.73 − 121.23/3 = 281.32
f(5)(1)           Category 5 (lower bound) + α * first_width         321.73 + 121.23/3 = 362.14
f(5)(2)           Category 5 (lower bound) + 2 * α * first_width     321.73 + 2 * 121.23/3 = 402.55
f(5)(3)           Category 5 (upper bound) + α * first_width         442.66 + 121.23/3 = 483.07
f(6)(0)           Category 6 (lower bound) − α * second_width        442.66 − 239.13/3 = 362.95
f(6)(1)           Category 6 (lower bound) + α * second_width        442.66 + 239.13/3 = 522.37
f(6)(2)           Category 6 (lower bound) + 2 * α * second_width    442.66 + 2 * 239.13/3 = 602.08
f(6)(3)           Category 6 (upper bound) + α * second_width        681.79 + 239.13/3 = 761.50

In Table 2, α may be set as a constant value of ⅓. Using Equation 4, values for the second measure of risk (e.g., the smoothed data representing the processed measure of risk) may be generated. For instance, ȳ(5)=0.4117, ȳ(6)=0.7399, and ȳ(7)=0.7866 based on the values for f(5) and f(6) described in Table 2.

For a feature value of X=443.0, however, the value of μ for both Category 5 and Category 6 may be determined. For example, based on Equation 3, μ(5) may be determined to equal 0.49579 (e.g., μ(5) = 1.0 − (X − f(5)(2)) / (f(5)(3) − f(5)(2))), while μ(6) may be determined to equal 0.50214 (e.g., μ(6) = (X − f(6)(0)) / (f(6)(1) − f(6)(0))). Furthermore, μ(7) may be determined to equal 0.0 based on Equation 3. To calculate the WOE for each of Category 5 and Category 6, Equation 5 may be employed, as seen in Equation 7 below:

$WOE = \log\left[ \frac{\mu(5)\,\bar{y}(5) + \mu(6)\,\bar{y}(6) + \mu(7)\,\bar{y}(7)}{\left(\mu(5) + \mu(6) + \mu(7)\right) - \left(\mu(5)\,\bar{y}(5) + \mu(6)\,\bar{y}(6) + \mu(7)\,\bar{y}(7)\right)} \right] = \log\left[ \frac{0.49579 \cdot 0.4117 + 0.50214 \cdot 0.7399 + 0}{\left(0.49579 + 0.50214 + 0\right) - 0.5757} \right] = \log\left[ \frac{0.5757}{0.42223} \right] \approx 0.31.$  Equation 7.

Employing Equation 6 with the result of Equation 7 therefore yields a smoothed predictive score of 830 (e.g., Predictive_Score = 793.34 + 118.26 * 0.3100 = 830). Of note here is that, because the new smoothed data is obtained, the coefficients of the predictive score equation have changed from 797.67 to 793.34 and from 100.72 to 118.26. This is due to the smoothed data causing the model to continually update with each new instance of newly smoothed data that is generated.
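
The figures in Table 2 and Equation 7 can be checked numerically with the membership function of Equation 3; the sketch below reuses the boundary values from Table 2 and the smoothed targets given above, and small differences from the quoted membership degrees are attributable to rounding of the tabulated boundaries:

```python
import math

def trapezoid(x, a, b, c, d):
    """Membership degree per Equation 3 (same form as the earlier sketch)."""
    if x < a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x < c:
        return 1.0
    return (d - x) / (d - c)

x = 443.0
mu5 = trapezoid(x, 281.32, 362.14, 402.55, 483.07)  # limits f(5)(0..3) from Table 2
mu6 = trapezoid(x, 362.95, 522.37, 602.08, 761.50)  # limits f(6)(0..3) from Table 2
mu7 = 0.0                                           # x lies outside Category 7's trapezoid
ybar5, ybar6, ybar7 = 0.4117, 0.7399, 0.7866        # smoothed targets from Equation 4

num = mu5 * ybar5 + mu6 * ybar6 + mu7 * ybar7
woe = math.log(num / (mu5 + mu6 + mu7 - num))       # Equation 5 / Equation 7: ≈ 0.31
score = 793.34 + 118.26 * woe                       # updated Equation 6 coefficients: ≈ 830
```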

FIG. 8 is an illustrative diagram of an exemplary mobile device architecture that may be used to realize a specialized system implementing the present teaching in accordance with various embodiments. In this example, the user device on which the fraudulent event detection systems and methods are implemented corresponds to a mobile device 800, including, but not limited to, a smart phone, a tablet, a music player, a handheld gaming console, a global positioning system (GPS) receiver, a wearable computing device (e.g., eyeglasses, wrist watch, etc.), or any other form factor. Mobile device 800 may include one or more central processing units (“CPUs”) 840, one or more graphic processing units (“GPUs”) 830, a display 820, a memory 860, a communication platform 810, such as a wireless communication module, storage 890, and one or more input/output (I/O) devices 850. Any other suitable component, including but not limited to a system bus or a controller (not shown), may also be included in the mobile device 800. As shown in FIG. 8, a mobile operating system 870 (e.g., iOS, Android, Windows Phone, etc.) and one or more applications 880 may be loaded into memory 860 from storage 890 in order to be executed by the CPU 840. The applications 880 may include a browser or any other suitable mobile apps for determining one or more fraudulent events on mobile device 800. User interactions with the content may be achieved via the I/O devices 850 and provided to the risk evaluation system 140.

To implement various modules, units, and their functionalities described in the present disclosure, computer hardware platforms may be used as the hardware platform(s) for one or more of the elements described herein (e.g., risk evaluation system 140). The hardware elements, operating systems, and programming languages of such computers are conventional in nature, and it is presumed that those skilled in the art are adequately familiar therewith to adapt those technologies to detect one or more fraudulent events as described herein. A computer with user interface elements may be used to implement a personal computer (PC) or other type of work station or terminal device, although a computer may also act as a server if appropriately programmed. It is believed that those skilled in the art are familiar with the structure, programming, and general operation of such computer equipment and, as a result, the drawings should be self-explanatory.

FIG. 9 is an illustrative diagram of an exemplary computing device architecture that may be used to realize a specialized system implementing the present teaching in accordance with various embodiments. Such a specialized system incorporating the present teaching has a functional block diagram illustration of a hardware platform, which includes user interface elements. The computer may be a general purpose computer or a special purpose computer. Both can be used to implement a specialized system for the present teaching. This computer 900 may be used to implement any component of the fraudulent event detection techniques, as described herein. For example, a fraudulent event detection system may be implemented on a computer such as computer 900, via its hardware, software program, firmware, or a combination thereof. Although only one such computer is shown, for convenience, the computer functions relating to fraudulent event detection as described herein may be implemented in a distributed fashion on a number of similar platforms, to distribute the processing load.

Computer 900, for example, includes COM ports 950 connected to a network to facilitate data communications. Computer 900 also includes a central processing unit (CPU) 920, in the form of one or more processors, for executing program instructions. The exemplary computer platform includes an internal communication bus 910 and program storage and data storage of different forms (e.g., disk 970, read only memory (ROM) 930, or random access memory (RAM) 940) for various data files to be processed and/or communicated by computer 900, as well as possibly program instructions to be executed by CPU 920. Computer 900 also includes an I/O component 960, supporting input/output flows between the computer and other components therein, such as user interface elements 980. Computer 900 may also receive programming and data via network communications.

Hence, aspects of the methods of detecting one or more fraudulent events and/or other processes, as outlined above, may be embodied in programming. Program aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of executable code and/or associated data that is carried on or embodied in a type of machine readable medium. Tangible non-transitory “storage” type media include any or all of the memory or other storage for the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide storage at any time for the software programming.

All or portions of the software may at times be communicated through a network such as the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, in connection with detection of one or more fraudulent events. Thus, another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links or the like, also may be considered as media bearing the software. As used herein, unless restricted to tangible “storage” media, terms such as computer or machine “readable medium” refer to any medium that participates in providing instructions to a processor for execution.

Hence, a machine-readable medium may take many forms, including but not limited to, a tangible storage medium, a carrier wave medium or physical transmission medium. Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, which may be used to implement the system or any of its components as shown in the drawings. Volatile storage media include dynamic memory, such as a main memory of such a computer platform. Tangible transmission media include coaxial cables, copper wire, and fiber optics, including the wires that form a bus within a computer system. Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a physical processor for execution.

Those skilled in the art will recognize that the present teachings are amenable to a variety of modifications and/or enhancements. For example, although the implementation of various components described above may be embodied in a hardware device, it may also be implemented as a software only solution, e.g., an installation on an existing server. In addition, the fraudulent event detection techniques as disclosed herein may be implemented as a firmware, firmware/software combination, firmware/hardware combination, or a hardware/firmware/software combination.

While the foregoing has described what are considered to constitute the present teachings and/or other examples, it is understood that various modifications may be made thereto and that the subject matter disclosed herein may be implemented in various forms and examples, and that the teachings may be applied in numerous applications, only some of which have been described herein. It is intended by the following claims to claim any and all applications, modifications and variations that fall within the true scope of the present teachings.

We claim:
1. A method for determining a likelihood that a future event is fraudulent, the method being implemented on at least one computing device having at least one processor, memory, and communications circuitry, and the method comprising: receiving event data representing a plurality of events detected by a content provider, wherein the event data indicates a feature associated with a corresponding event and whether the corresponding event was fraudulent; generating category data by grouping each of the plurality of events into one of categories, wherein each of the categories is associated with a range associated with the feature; training a predictive model based on the category data; and determining, based on the predictive model, a likelihood that a future event is fraudulent.
2. The method of claim 1, wherein the feature corresponds to hours of a day.
3. The method of claim 2, wherein each of the categories is associated with a distinct range of times in a day.
4. The method of claim 1, wherein the feature corresponds to one of: a browser cookie age, a distance between a current IP address location of a user device associated with a user interaction and a home IP address associated with a most frequently used IP address of the user device, a number of visits in a certain number of days, a number of clicks for an entity in a certain number of days, an average click-through-rate for an entity in a certain number of days, an average fraud in an entity in a certain number of days, and a ratio of IP addresses over user agents on a website.
5. The method of claim 1, further comprising: applying a smoothing function to the category data to generate smoothed data.
6. The method of claim 5, further comprising: updating the predictive model based on the smoothed data.
7. The method of claim 1, further comprising: allocating a portion of a memory for a data structure including the category data.
8. A non-transitory computer readable medium comprising instructions for removing perturbations from predictive scoring, wherein the instructions, when read by at least one processor of a computing device, cause the computing device to perform: receiving event data representing a plurality of events detected by a content provider, wherein the event data indicates a feature associated with a corresponding event and whether the corresponding event was fraudulent; generating category data by grouping each of the plurality of events into one of categories, wherein each of the categories is associated with a range associated with the feature; training a predictive model based on the category data; and determining, based on the predictive model, a likelihood that a future event is fraudulent.
9. The medium of claim 8, wherein the feature corresponds to hours of a day.
10. The medium of claim 9, wherein each of the categories is associated with a distinct range of times in a day.
11. The medium of claim 8, wherein the feature corresponds to one of: a browser cookie age, a distance between a current IP address location of a user device associated with a user interaction and a home IP address associated with a most frequently used IP address of the user device, a number of visits in a certain number of days, a number of clicks for an entity in a certain number of days, an average click-through-rate for an entity in a certain number of days, an average fraud in an entity in a certain number of days, and a ratio of IP addresses over user agents on a website.
12. The medium of claim 8, wherein the instructions, when read by the processor of the computing device, cause the computing device to further perform: applying a smoothing function to the category data to generate smoothed data.
13. The medium of claim 12, wherein the instructions, when read by the processor of the computing device, cause the computing device to further perform: updating the predictive model based on the smoothed data.
14. The medium of claim 8, wherein the instructions, when read by the processor of the computing device, cause the computing device to further perform: allocating a portion of a memory for a data structure including the category data.
15. A system for removing perturbations from predictive scoring, the system comprising: memory storing computer program instructions; and one or more processors that, in response to executing the computer program instructions, effectuate operations comprising: receiving event data representing a plurality of events detected by a content provider, wherein the event data indicates a feature associated with a corresponding event and whether the corresponding event was fraudulent; generating category data by grouping each of the plurality of events into one of categories, wherein each of the categories is associated with a range associated with the feature; training a predictive model based on the category data; and determining, based on the predictive model, a likelihood that a future event is fraudulent.
16. The system of claim 15, wherein the feature corresponds to hours of a day.
17. The system of claim 16, wherein each of the categories is associated with a distinct range of times in a day.
18. The system of claim 15, wherein the operations further comprise: applying a smoothing function to the category data to generate smoothed data.
19. The system of claim 18, wherein the operations further comprise: updating the predictive model based on the smoothed data.
20. The system of claim 15, wherein the operations further comprise: allocating a portion of a memory for a data structure including the category data.